Supplementary Materials A Organization of Supplementary Materials

Neural Information Processing Systems

The supplementary materials consist of five main sections. In Appendix B, we give a detailed overview of the related literature. In Appendix C (Proofs for Section 3), we give the proofs of Theorem 1 and Proposition 1. In Appendix D (Algorithm and Implementation Details), we provide further details about the implementation and training procedure for PerSim and the RL methods we benchmark against. In Appendix E, we detail the setup used to run our experiments.




Towards Internet-Scale Training For Agents

Trabucco, Brandon, Sigurdsson, Gunnar, Piramuthu, Robinson, Salakhutdinov, Ruslan

arXiv.org Artificial Intelligence

The predominant approach for training web navigation agents gathers human demonstrations for a set of popular websites and hand-written tasks, but it is becoming clear that human data are an inefficient resource. We develop a pipeline to facilitate Internet-scale training for agents without laborious human annotations. In the first stage, an LLM generates tasks for 150k diverse websites. In the next stage, LLM agents complete tasks and produce trajectories. In the final stage, an LLM reviews the trajectories and judges their success. Language models are competitive with human annotators, detecting and filtering out harmful content with an accuracy of 97%, generating feasible tasks with an 89% rate, and judging successful trajectories with an 82.6% accuracy. Scaling the pipeline, agents based on Llama 3.1 70B solve 16.7% of tasks for 150k sites. Training on the data generated by our pipeline is competitive with training on human demonstrations. In data-limited settings derived from Mind2Web and WebLINX, we improve Step Accuracy by up to +89.5% and +122.1% respectively for agents trained on mixtures of data from our pipeline and human data. When training agents with all available human data from these benchmarks, agents fail to generalize to diverse real sites, and adding our data improves their generalization by +149.0% for WebLINX and +156.3% for Mind2Web. Code will be available at: data-for-agents.github.io.


Pre-trained Language Models Improve the Few-shot Prompt Ability of Decision Transformer

Yang, Yu, Xu, Pan

arXiv.org Artificial Intelligence

Decision Transformer (DT) has emerged as a promising class of algorithms in offline reinforcement learning (RL) tasks, leveraging pre-collected datasets and the Transformer's capability to model long sequences. Recent works have demonstrated that using parts of trajectories from training tasks as prompts in DT enhances its performance on unseen tasks, giving rise to Prompt-DT methods. However, collecting data from specific environments can be both costly and unsafe in many scenarios, leading to suboptimal performance and limited few-shot prompt abilities due to the data-hungry nature of Transformer-based models. Additionally, the limited datasets used in pre-training make it challenging for Prompt-DT-style methods to distinguish between various RL tasks through prompts alone. To address these challenges, we introduce the Language model-initialized Prompt Decision Transformer (LPDT), which leverages pre-trained language models for meta-RL tasks and fine-tunes the model using Low-rank Adaptation (LoRA). We further incorporate prompt regularization to effectively differentiate between tasks based on prompt feature representations. Our approach integrates pre-trained language models with RL tasks seamlessly. Extensive empirical studies demonstrate that initializing with a pre-trained language model significantly enhances the performance of Prompt-DT on unseen tasks compared to baseline methods.
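The LoRA fine-tuning the abstract refers to replaces full weight updates with a trainable low-rank correction to each frozen weight matrix. A minimal NumPy sketch of one adapted linear layer (illustrative sizes and names, not LPDT's actual code or configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, rank = 64, 64, 4  # illustrative dimensions, not LPDT's settings

# Frozen pre-trained weight matrix.
W = rng.standard_normal((d_in, d_out))

# LoRA factors: A is small random, B starts at zero, so at initialization
# the adapted layer computes exactly the same output as the frozen one.
A = rng.standard_normal((rank, d_out)) * 0.01
B = np.zeros((d_in, rank))
alpha = 8.0  # scaling hyperparameter, commonly called lora_alpha

def lora_linear(x):
    """Forward pass of a LoRA-adapted linear layer.

    Only A and B (rank * (d_in + d_out) parameters) would be trained;
    W stays frozen, which is what makes the adaptation cheap.
    """
    return x @ W + (alpha / rank) * (x @ B @ A)

x = rng.standard_normal((2, d_in))
# Because B is zero, the low-rank update contributes nothing initially:
assert np.allclose(lora_linear(x), x @ W)
```

The design point is that the trainable parameter count scales with `rank`, not with `d_in * d_out`, which is why LoRA suits the data-limited settings the abstract describes.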


Multi-Agent Training for Pommerman: Curriculum Learning and Population-based Self-Play Approach

Huynh, Nhat-Minh, Cao, Hoang-Giang, Wu, I-Chen

arXiv.org Artificial Intelligence

Pommerman is a multi-agent environment that has received considerable attention from researchers in recent years. This environment is an ideal benchmark for multi-agent training, providing a battleground for two teams with communication capabilities among allied agents. Pommerman presents significant challenges for model-free reinforcement learning due to delayed action effects, sparse rewards, and false positives, where opponent players can lose due to their own mistakes. This study introduces a system designed to train multi-agent systems to play Pommerman using a combination of curriculum learning and population-based self-play. We also tackle two challenging problems in deploying a multi-agent training system for competitive games: sparse rewards and a suitable matchmaking mechanism. Specifically, we propose an adaptive annealing factor based on agents' performance to dynamically adjust the dense exploration reward during training. Additionally, we implement a matchmaking mechanism utilizing the Elo rating system to pair agents effectively. Our experimental results demonstrate that our trained agent can outperform top learning agents without requiring communication among allied agents.
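The Elo-based matchmaking the abstract mentions can be sketched in a few lines: each agent carries a rating, ratings move after every match, and opponents are drawn from a rating-similar pool. This is a generic Elo sketch with illustrative helper names and parameters (K-factor, window), not the authors' implementation:

```python
import random

def expected_score(ra, rb):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))

def update_elo(ra, rb, score_a, k=32):
    """Update both ratings after one match; score_a is 1 win, 0.5 draw, 0 loss."""
    ea = expected_score(ra, rb)
    ra_new = ra + k * (score_a - ea)
    rb_new = rb + k * ((1.0 - score_a) - (1.0 - ea))
    return ra_new, rb_new

def pick_opponent(agent, pool, window=100):
    """Prefer opponents whose rating is within `window` of the agent's rating."""
    name, rating = agent
    close = [p for p in pool if abs(p[1] - rating) <= window]
    return random.choice(close or pool)  # fall back to the full pool if empty

ratings = {"A": 1200.0, "B": 1200.0}
# Equal ratings and an A win: A gains k/2 = 16 points, B loses 16.
ratings["A"], ratings["B"] = update_elo(ratings["A"], ratings["B"], 1.0)
```

Pairing rating-similar agents keeps matches informative: a population member rated far above the training agent would win almost every game and provide little learning signal.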


Reinforcement Learning Based Self-play and State Stacking Techniques for Noisy Air Combat Environment

Tasbas, Ahmet Semih, Sahin, Safa Onur, Ure, Nazim Kemal

arXiv.org Artificial Intelligence

Reinforcement learning (RL) has recently proven itself a powerful instrument for solving complex problems and has even surpassed human performance in several challenging applications. This suggests that RL algorithms can be applied to the autonomous air combat problem, which has been studied for many years. The complexity of air combat arises from aggressive close-range maneuvers and agile enemy behaviors. In addition to these complexities, real-life scenarios may involve uncertainty due to sensor errors, which prevents estimation of the enemy's actual position. Autonomous aircraft should therefore succeed even in noisy environments. In this study, we developed an air combat simulation that provides noisy observations to the agents, making the air combat problem even more challenging, and we present a state stacking method for noisy RL environments as a noise reduction technique. In our extensive set of experiments, the proposed method significantly outperforms the baseline algorithms in terms of winning ratio, and the performance improvement is even more pronounced at high noise levels. In addition, we incorporate a self-play scheme into our training process by periodically updating the enemy with a frozen copy of the training agent. In this way, the training agent flies air combat simulations against an enemy with progressively smarter strategies, which improves the performance and robustness of the agents. In our simulations, we demonstrate that the self-play scheme provides important performance gains compared to classical RL training.
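State stacking, as described in the abstract, feeds the policy a window of the last k noisy observations instead of only the current one, letting the network implicitly filter sensor noise. A minimal generic sketch (class and parameter names are illustrative, not the paper's code):

```python
from collections import deque

import numpy as np

class StateStacker:
    """Maintain the last k observations and expose them as one stacked state.

    With noisy sensors, a window of recent observations carries more signal
    than any single frame, which is the intuition behind state stacking.
    """

    def __init__(self, k, obs_dim):
        self.k = k
        self.obs_dim = obs_dim
        self.buf = deque(maxlen=k)  # oldest frame is dropped automatically

    def reset(self, obs):
        """Start an episode: pad the window with the first observation."""
        self.buf.clear()
        for _ in range(self.k):
            self.buf.append(obs)
        return self.state()

    def step(self, obs):
        """Append the newest noisy observation and return the stacked state."""
        self.buf.append(obs)
        return self.state()

    def state(self):
        return np.concatenate(self.buf)  # shape: (k * obs_dim,)

stacker = StateStacker(k=4, obs_dim=3)
s = stacker.reset(np.zeros(3))   # stacked state has shape (12,)
s = stacker.step(np.ones(3))     # newest frame occupies the last obs_dim slots
```

The same wrapper pattern covers the self-play scheme too: the environment's opponent policy is simply swapped for a frozen snapshot of the learner at fixed intervals.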


On Multi-Agent Learning in Team Sports Games

Zhao, Yunqi, Borovikov, Igor, Rupert, Jason, Somers, Caedmon, Beirami, Ahmad

arXiv.org Artificial Intelligence

In recent years, reinforcement learning has been successful in solving video games from Atari to StarCraft II. However, end-to-end model-free reinforcement learning (RL) is not sample-efficient and requires a significant amount of computational resources to achieve superhuman-level performance. Model-free RL is also unlikely to produce human-like agents for playtesting and gameplaying AI in the development cycle of complex video games. In this paper, we present a hierarchical approach to training agents with the goal of achieving human-like style and a high skill level in team sports games. While this is still work in progress, our preliminary results show that the presented approach holds promise for solving the posed multi-agent learning problem.


Winning Isn't Everything: Training Human-Like Agents for Playtesting and Game AI

Zhao, Yunqi, Borovikov, Igor, Beirami, Ahmad, Rupert, Jason, Somers, Caedmon, Harder, Jesse, Silva, Fernando de Mesentier, Kolen, John, Pinto, Jervis, Pourabolghasem, Reza, Chaput, Harold, Pestrak, James, Sardari, Mohsen, Lin, Long, Aghdaie, Navid, Zaman, Kazi

arXiv.org Artificial Intelligence

Recently, there have been several high-profile achievements of agents learning to play games against humans and beat them. We consider an alternative approach that instead addresses game design for a better player experience by training human-like game agents. Specifically, we study the problem of training game agents in service of the development processes of the game developers that design, build, and operate modern games. We highlight some of the ways in which we think intelligent agents can assist game developers to understand their games, and even to build them. Our early results using the proposed agent framework mark a few steps toward addressing the unique challenges that game developers face.